Here we create a stacked bar graph to visualize how many purchases occured by every month for each company. We can clearly the the numbers in white for each customer. The stacked bar graph is in order exactly like the legend so as you move down each square you would follow along with the legend.
Purchase_Data %>%
ggplot(aes(x = format(Date, "%Y-%m"), fill = Company))+
geom_bar(color = 'Black') +
labs(x = 'Month', y = "Count of Purchases by Month", title = "How many Purchases by Companies every Month")+
geom_text(stat = 'count', aes(label=..count..), size = 3, position = position_stack(vjust = 0.5), color="white")
We want find the value of our bills we need to charge at the end of each month. The bill is calculated at the sum of the purchases a company creates minus the value of the payment they made in the same period. In our data payments are denoted as (-) so in our data we do an addition. Finally we want to see what the maximum amout of money we will collect in this year will be.
Bill_Data <- Purchase_Data %>%
group_by(month, Company) %>%
summarise(sum_spend = sum(Spend, na.rm = TRUE), sum_payment = sum(Payment, na.rm = TRUE), Bill = sum_spend + sum_payment)
max(Bill_Data$Bill)
## [1] 9059216
Now we want to see what is the dollar amount of the higest bill for each customer. This is the amount of money that each company owes.
Bill_Data %>%
group_by(Company) %>%
summarise(max_bill = max(Bill))
## # A tibble: 19 x 2
## Company max_bill
## <chr> <dbl>
## 1 Ashley 3797052
## 2 Daniel 5552764
## 3 David 3175100
## 4 Eddy 6547000
## 5 Frank 7165700
## 6 George 4702180
## 7 Helen 5999811
## 8 Jennifer 5535789
## 9 Karen 8128001
## 10 Lizzy 609204
## 11 Mary 7590764
## 12 Michael 4443700
## 13 Nancy 2562226
## 14 Patricia 700211
## 15 Scott 9059216
## 16 Sean 5700200
## 17 Susan 4427353
## 18 Tom 5870400
## 19 Walter 4119100
Now we want to see which companies are the most profitable for us. So we look at the total amount of purchases that is made by each specific company. Then we order them from the largest total purchase to the smallest and return the values we are looking for: The most profitable companies.
Profitable_margin_customers <- Purchase_Data %>%
group_by(Company) %>%
summarise(total_purchase = sum(Spend, na.rm = TRUE)) %>%
arrange(desc(total_purchase))
Profitable_margin_customers[1,]
## # A tibble: 1 x 2
## Company total_purchase
## <chr> <dbl>
## 1 Scott 68201664
Profitable_margin_customers[4,]
## # A tibble: 1 x 2
## Company total_purchase
## <chr> <dbl>
## 1 David 58145300
Profitable_margin_customers[6,]
## # A tibble: 1 x 2
## Company total_purchase
## <chr> <dbl>
## 1 Mary 52393670
Since most companies are not paying their entire orders in one payment, but have spread them out over an amount of time. We need to determine what our exposure is, that is, how much would we be in debt if our customers suddenly went bankrupt. We can see that some months we are (-) which means that these months companies made enough payments to push us into cash flow positive territory.
Bill_Data %>%
group_by(month) %>%
summarise(credit_exposure = sum(Bill))
## # A tibble: 12 x 2
## month credit_exposure
## <chr> <dbl>
## 1 01 53271317
## 2 02 32906230
## 3 03 24391771
## 4 04 20712338
## 5 05 23166734
## 6 06 3896696
## 7 07 - 3090934
## 8 08 -22527905
## 9 09 5832091
## 10 10 42366148
## 11 11 -47902177
## 12 12 - 3728272
We have looked at the most valueable customer when it comes to the total amount of purchases made, but now we are looking for the most valueable customers that are making the most payments on their purchases.
Customer_Payments <- Purchase_Data %>%
group_by(Company) %>%
summarise(total_payments = sum(Payment, na.rm = TRUE)) %>%
arrange(total_payments)
Customer_Payments[1,]
## # A tibble: 1 x 2
## Company total_payments
## <chr> <dbl>
## 1 Scott -63209095
Customer_Payments[4,]
## # A tibble: 1 x 2
## Company total_payments
## <chr> <dbl>
## 1 Mary -51800000
Customer_Payments[6,]
## # A tibble: 1 x 2
## Company total_payments
## <chr> <dbl>
## 1 Sean -49523100
We can break down our information of payments to look at who is making the highest payment by month.
Customer_Payments_month1 <- Purchase_Data %>%
group_by(Company, month) %>%
summarise(total_payments = sum(Payment, na.rm = TRUE)) %>%
group_by(month) %>%
summarise(total_payments = min(total_payments))
Customer_Payments_month2 <- Purchase_Data %>%
group_by(Company, month) %>%
summarise(total_payments = sum(Payment, na.rm = TRUE))
Customer_Payments_month <- left_join(Customer_Payments_month1, Customer_Payments_month2, by = c('month', 'total_payments'))
Customer_Payments_month
## # A tibble: 12 x 3
## month total_payments Company
## <chr> <dbl> <chr>
## 1 01 - 4272000 Lizzy
## 2 02 -10000000 Scott
## 3 03 - 9045600 Frank
## 4 04 - 9000000 Scott
## 5 05 -10000000 Walter
## 6 06 -12169922 Daniel
## 7 07 -13400000 Jennifer
## 8 08 -13068510 George
## 9 09 - 5108500 David
## 10 10 - 6240000 Michael
## 11 11 -45000000 Mary
## 12 12 - 9923400 Eddy
Here we want to see what the total amount of purchases in dollar value is for each company in the billing cycle beginning on the 16th of every month and ending on the 15th.
Billing_Cycle <- left_join(January_Billing_Cycle, February_Billing_Cycle, by='Company') %>%
left_join(., March_Billing_Cycle, by = "Company") %>%
left_join(., April_Billing_Cycle, by='Company') %>%
left_join(., May_Billing_Cycle, by = "Company") %>%
left_join(., June_Billing_Cycle, by = 'Company') %>%
left_join(., July_Billing_Cycle, by= 'Company') %>%
left_join(., August_Billing_Cycle, by = "Company") %>%
left_join(., September_Billing_Cycle, by = "Company") %>%
left_join(., October_Billing_Cycle, by = "Company") %>%
left_join(., November_Billing_Cycle, by = "Company")
Billing_Cycle
## # A tibble: 19 x 12
## Company January February March April May June July
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Ashley 3098183 5745574 3138154 4348063 4111703 2894911 1805020
## 2 Daniel 5456225 4291621 5720926 5752545 6390772 4940389 6886136
## 3 David 2628400 4991700 4945100 5000600 4431500 2214500 5753400
## 4 Eddy 4261600 1369600 5117200 2938300 6096200 6313500 2228200
## 5 Frank 4209800 3626900 9199100 4813100 1254100 2907300 5482600
## 6 George 4755520 1926960 2246930 2627170 5117280 3466190 3450860
## 7 Helen 3568047 5011199 3311189 3927736 3635539 4842149 3960818
## 8 Jennifer 2719514 1572964 2797997 3438740 3562300 4700405 6040675
## 9 Karen 2659484 6536110 6626620 3754103 3879444 1436659 3575490
## 10 Lizzy 5855749 5027807 3960768 1804324 3085114 1635378 1164010
## 11 Mary 3342360 5376971 6827838 4410938 3980587 3242041 4948087
## 12 Michael 2989700 6417600 3788500 2579400 3942500 5546300 3118400
## 13 Nancy 5617931 2673948 4897152 3668491 3387715 3482032 5612720
## 14 Patricia 6038053 2981111 3234581 3738710 2083005 1936890 5131643
## 15 Scott 12252657 3909827 8462021 4997324 4598806 5246486 3186489
## 16 Sean 6280400 3491600 4441500 5407500 4825400 3790000 2399400
## 17 Susan 3972893 2992308 2487797 4302904 1402766 4096346 2547483
## 18 Tom 4363200 5899200 2883800 4415500 4467100 3712900 635700
## 19 Walter 5893200 3083800 5308200 5162900 1810600 2496300 8944200
## August September October November
## <dbl> <dbl> <dbl> <dbl>
## 1 2535477 1346772 2784222 7039906
## 2 3160526 2591146 5102280 3832227
## 3 7170400 4590400 6339000 4653100
## 4 5576500 4020200 4868600 671700
## 5 3600200 3745700 2895000 4855900
## 6 2092020 2261770 1616300 3822770
## 7 2213471 3778384 7238745 3211923
## 8 2138248 3935939 5368405 1634365
## 9 2435576 4453286 3445842 3363250
## 10 2602199 4014915 2463920 4102647
## 11 3038473 2587975 7442342 3927270
## 12 3153300 4720900 4178000 4313000
## 13 4176496 1760638 2159807 6102830
## 14 3977498 4717999 2740970 3272324
## 15 3649871 5208052 4077504 7006828
## 16 4497000 3857800 9199300 5509400
## 17 5343591 3589804 6727223 2599159
## 18 2182300 6500700 4291700 4677800
## 19 3967400 4853200 6625000 3386800
Here we want to see how the purchases of all the companies look across the entire year of 2017. Plotting each respective company we can go ahead and smooth out the data to see if we can determine any form of trend by each company. This will be a start for our machine learning to come for the forecasting of the year of 2018 by each company. We can see some form of patterns in the data as some companies tend to decrease their orders while others have a strong consistency of orders.
Purchase_Data %>%
ggplot(aes(x = Date, y = Spend, group = 1))+
facet_wrap(~ Company, ncol = 2, scales = 'free')+
geom_point(aes(color = day_of_week), na.rm = TRUE)+
geom_line(na.rm = TRUE, show.legend = FALSE, color = palette_light()[[1]], alpha = 0.8)+
geom_smooth(na.rm = TRUE, color = 'blue')+
scale_color_manual(values = palette_light())+
theme_tq()+
labs(color = "Days of the Week", title = "Purchases From The Entire Year by each Company", x = 'Days')
## `geom_smooth()` using method = 'loess'
Here we are forecasting the pruchases of the entire year of 2018. We are not considering specific companies, we keep every else constant and only focus on the total sales. We can see that next year the purchases will start off strong with an increase in purchases and continue to increase until about half-way through the year. Then we can expect that purchases will begin to slow down into December but still the total amount of purchases will be at a much higher average than that of 2017. We can only make this slight prediction since our data set is only for a year so we cannot determine if there is a consistent patter, we would have to analyze more data in order to see if there is some sort of trend of every year there being a total increase in the whole amount of purchases, then proceed to slow down and consolidate.
purchase_spend %>%
select(Date, total_spend) %>%
rename(date = Date) %>%
ggplot(aes(x = date, y = total_spend))+
geom_point(alpha = .5)+
geom_line(alpha = .5)+
geom_ribbon(aes(ymin = lo.95, ymax = hi.95), data = pred_future,
fill = '#D5DBFF', color = NA, size = 0)+
geom_ribbon(aes(ymin = lo.80, ymax = hi.80, fill = key), data = pred_future,
fill = "#596DD5", color = NA, size = 0, alpha =.8)+
geom_point(aes(x = date, y = (total_spend)), data = pred_future,
alpha = .5, color = palette_light()[2], na.rm = TRUE)+
geom_smooth(aes(x = date, y = (total_spend)), data = pred_future,
method = 'loess', color = 'white', na.rm = TRUE)+
theme_tq()